Journal of Theoretical Biology
○ Elsevier BV
Preprints posted in the last 7 days, ranked by how well they match Journal of Theoretical Biology's content profile, based on 144 papers previously published here. The average preprint has a 0.06% match score for this journal, so anything above that is already an above-average fit.
Filippini, S.; Ridolfi, L.; von Hardenberg, J.
Patterns in the vegetation across arid and semiarid regions may be explained as a form of self-organization driven by water scarcity, and are often modeled through reaction-diffusion dynamics. Recent work has shown that similar mathematical models generate patterns on networks. However, these studies have focused on idealized topologies with no reference to natural pattern-forming systems. Our study aims at bridging these two fields: we employ a physical reaction-diffusion vegetation model, and gradually modify the topology of the diffusion network by adding random shortcuts over a 2-dimensional grid, interpolating between a regular lattice and a random network. We found that network topology strongly shapes both the resulting vegetation patterns and the precipitation range that supports them. Three behavioral regimes emerge. On a regular lattice, high-regularity patterns develop reflecting local diffusion processes. On a random network, the system is dominated by global pressure towards homogenization yielding either a uniform state or a single patch. In the intermediate shortcut density range, as the network topology resembles a small-world network, the interaction between the two scales of diffusion generates two kinds of disordered patterns: low-regularity patterns with a well-defined characteristic wavelength, and irregular patterns characterized by a broad patch size distribution. These disordered patterns resemble real-world observations and, in our model, they show different responses to changing precipitation. Although we focused on dryland vegetation, we suggest that network-mediated diffusion could lead to similar mechanisms in a wide variety of pattern-forming systems.
Highlights:
- We study vegetation pattern formation over different diffusion network topologies.
- Two kinds of stable disordered pattern states develop over small-world topologies.
- Low-regularity patterns with a well-defined characteristic wavelength.
- Irregular patterns characterized by a broad patch size distribution.
- These different kinds of disordered states show different relations to precipitation.
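The network construction described in this abstract (a 2-dimensional grid with random shortcuts added) can be illustrated with a minimal sketch. It covers only the diffusion part; the vegetation reaction terms are not given in the abstract and are left out. The function names, the periodic boundary, and the parameter values are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def lattice_with_shortcuts(n, n_shortcuts, seed=0):
    """Adjacency matrix of an n x n periodic grid (n >= 3) plus random shortcuts.

    The periodic boundary is an assumption; the abstract does not specify one.
    """
    rng = np.random.default_rng(seed)
    size = n * n
    adj = np.zeros((size, size), dtype=int)
    # Regular lattice: connect each cell to its right and lower neighbour.
    for r in range(n):
        for c in range(n):
            i = r * n + c
            for dr, dc in ((0, 1), (1, 0)):
                j = ((r + dr) % n) * n + (c + dc) % n
                adj[i, j] = adj[j, i] = 1
    free_pairs = size * (size - 1) // 2 - adj.sum() // 2
    if n_shortcuts > free_pairs:
        raise ValueError("more shortcuts requested than unconnected node pairs")
    # Shortcuts: add edges between random, not yet connected node pairs.
    added = 0
    while added < n_shortcuts:
        i, j = rng.integers(0, size, size=2)
        if i != j and adj[i, j] == 0:
            adj[i, j] = adj[j, i] = 1
            added += 1
    return adj

def graph_laplacian(adj):
    """Laplacian L = A - D, so that du/dt = d * L @ u is diffusion on the graph."""
    return adj - np.diag(adj.sum(axis=1))

def diffuse(u, adj, d=0.1, dt=0.1, steps=100):
    """Explicit Euler integration of pure diffusion over the network."""
    lap = graph_laplacian(adj)
    for _ in range(steps):
        u = u + dt * d * (lap @ u)
    return u
```

With `n_shortcuts=0` this reduces to nearest-neighbour diffusion on the lattice; raising `n_shortcuts` adds long-range exchange between distant cells, which is the second diffusion scale the abstract refers to.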
Revell, L. J.; Alencar, L. R. V.; Alfaro, M. E.; Dain, J.; Hill, N. J.; Jones, M.; Martinet, K. M.; Romero-Alarcon, V.; Harmon, L. J.
The practical utility of many modern phylogenetic comparative methods can depend on how accurately mathematical models capture the evolutionary process of traits. Boucher and Demery (2016) described a new quantitative trait model, Brownian motion with reflective limits, that they anticipated might be of use in testing hypotheses about a particular sort of constraint on phenotypic character evolution. Since their analytic solution for the probability function under this bounded evolutionary scenario was not practical to evaluate for reasonably-sized trees, Boucher and Demery (2016) also identified a creative technique for computing the likelihood of their model. The basis of this methodology derives from the convergence of an equal-rates, symmetric, ordered Markov chain and continuous stochastic diffusion in the limit as the number of steps in our chain goes to infinity (or, alternatively, as their widths decrease towards zero). We refer to this convergence in the limit as the discretized diffusion approximation or (more compactly) the discrete approximation. We realized that this discrete approximation of Boucher and Demery (2016) unlocked a number of additional models for the phylogenetic comparative analysis of discrete and continuous trait data, and we explore several of these in the present article. Specifically, we examine application of this discretized diffusion approximation to the threshold model from evolutionary quantitative genetics, to a new "semi-threshold" trait evolution model, to a joint model of discrete and continuous traits in which the discrete trait influences the rate of evolution of our continuous character, as well as a model where precisely the converse is true, and to a discrete-character-dependent, multi-trend continuous trait evolution model. We conclude with some context for the origins of our article and discussion of other possible applications of this powerful approach.
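The discrete approximation described in this abstract can be sketched numerically. The sketch assumes that the trait interval is cut into `k` equal bins and that neighbouring bins exchange at rate sigma^2 / (2 * width^2), the usual choice that makes the chain's variance grow like Brownian motion with rate sigma^2. The function name and arguments are illustrative and are not taken from the article or any package.

```python
import numpy as np

def bounded_bm_transition(sigma2, t, lower, upper, k):
    """Transition probabilities of a k-state ordered chain approximating
    Brownian motion with rate sigma2 and reflective limits [lower, upper]."""
    delta = (upper - lower) / k                # bin width
    q = sigma2 / (2.0 * delta ** 2)            # rate between adjacent bins
    Q = np.zeros((k, k))
    idx = np.arange(k - 1)
    Q[idx, idx + 1] = q                        # step up one bin
    Q[idx + 1, idx] = q                        # step down one bin
    Q[np.diag_indices(k)] = -Q.sum(axis=1)     # rows of Q sum to zero
    # Q is symmetric, so exp(Q t) follows from its eigendecomposition.
    w, V = np.linalg.eigh(Q)
    P = (V * np.exp(w * t)) @ V.T
    midpoints = lower + delta * (np.arange(k) + 0.5)
    return P, midpoints
```

Row `i` of `P` gives the probability of ending in each bin after time `t` when starting in bin `i`. The end bins have a single neighbour, which is what makes the limits reflective in this sketch.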
Gantenberg, J. R.; La Joie, R.; Heston, M. B.; Ackley, S. F.
Qualitative models of Alzheimer's pathology often posit that amyloid accumulation follows a sigmoid curve, indicating that the rate of deposition wanes over time. Longitudinal PET data now allow us to investigate amyloid accumulation trajectories with greater detail and over longer follow-up periods. We combine inferences from simulated amyloid trajectories, empirical PET data from the Alzheimer's Disease Neuroimaging Initiative (ADNI), and the sampled iterative local approximation algorithm (SILA) to assess whether amyloid accumulation reaches a physiologic ceiling. We find that SILA reliably detects a ceiling, when present, across a range of simulated scenarios that impose a sigmoid shape. When fit to empirical data from ADNI, however, SILA does not appear to indicate the presence of a ceiling. Thus, we conclude that amyloid trajectories may not reach a physiologic ceiling during the stages of Alzheimer's disease typically observed while patients remain under follow-up in cohort studies. Fits using SILA indicate that illustrative models of biomarker cascades, while useful tools for conceptualizing and interrogating pathologic processes, may not represent the shapes of amyloid trajectories accurately. Summary for General Public: Amyloid, a protein implicated in Alzheimer's disease, is thought to reach a plateau in the brain, but methods that estimate how amyloid changes over time suggest it grows unabated. Gantenberg et al. use one such method and simulations to argue that amyloid does not reach a plateau during the typical course of Alzheimer's.
AZOTE epse HASSIKPEZI, S.; Negi, R. S.; Chen, N.; Manning, M. L.
Stratified epithelial tissues such as the skin epidermis maintain barrier integrity during development and homeostasis through the coordinated action of cell proliferation, differentiation, delamination, and tissue-scale mechanical forces. During development, the orientation of cell division within the basal layer plays a pivotal role in tissue stratification; however, the mechanical principles linking the orientation of the division plane to these processes across developmental stages remain poorly understood. Here, we expand a recently developed three-dimensional vertex model for stratified epithelia, composed of the basement membrane, basal, and suprabasal layers, to study the mechanical and structural impact of cell divisions with a wider range of orientations. The model integrates developmental stage via specific changes in heterotypic interfacial tensions (arising from actomyosin cortical contractility and adhesion molecules at the basal-suprabasal interface) and tissue stiffness that have been quantified previously in experiments. By systematically varying background mechanical parameters, we investigate how heterotypic tension, division orientation, and tissue fluidity collectively influence the outcome of cell division. Our goal is to uncover the strategies that the embryo may employ to generate stratified phenotypes at different developmental stages, recognizing that these strategies might evolve over time. Although our focus is on the embryonic developmental stages of the epidermis, this framework may also be extended to investigate transformed cells, such as in cancer, to explore how altered division orientation contributes to precancerous or transformed phenotypes.
Fang, M.; Mao, J.; Donner, T. H.; Stocker, A. A.
Evidence accumulation is a fundamental aspect of human decision-making. However, how the precise temporal structure of evidence shapes the accumulation process has not been systematically studied. As a result, current understanding of evidence accumulation remains largely limited to its time-averaged behavior. We tested human subjects in a visual estimation task in which they inferred the angular position of an unknown source from a noisy stimulus sequence. Introducing systematic temporal perturbations, i.e., breaks of different durations and at different positions in the otherwise regular evidence sequence, revealed that subjects actively compensated for the memory loss endured during the break by dynamically enhancing evidence integration and memory maintenance immediately after the break. We derived a new time-continuous Bayesian updating model that is dynamically constrained by optimal performance-effort trade-offs. With two free parameters determining the overall resource-efficiencies of encoding and memory maintenance, the model accurately predicts the rich dependencies of subjects' accumulation behavior on the evidence schedule, including subjects' individual tendencies to emphasize either early (primacy) or late (recency) samples in the evidence sequence. Our results suggest that evidence accumulation is a non-stationary, dynamically controlled process that optimally balances the information gained from incoming evidence against the cognitive effort required to acquire and maintain it. The proposed model is general and should apply broadly across many task domains.
Goryanin, I.; Damms, B.; Goryanin, I.
Background: Ageing is a systems-level biological process underlying the onset and progression of multiple chronic disorders. Rather than arising from a single pathway, age-related decline reflects interacting disturbances in metabolic regulation, inflammation, nutrient sensing, cellular stress responses, and tissue repair. Yet GLP1 receptor agonists, sodium-glucose cotransporter 2 inhibitors, metformin, and rapamycin are usually evaluated against disease-specific endpoints. Objective: To develop an SBML-compliant quantitative systems pharmacology model in which ageing is the primary pharmacological endpoint and to evaluate which combination therapy provides the greatest benefit for both metabolic and ageing-related outcomes. Methods: We developed a model comprising four layers: a metabolic/pharmacodynamic layer describing weight loss, HbA1c reduction, and nausea with tolerance; a drug layer capturing class-specific effects of GLP1 agonists, sodium-glucose cotransporter 2 inhibitors, metformin, and rapamycin; an ageing layer representing damage accumulation, repair capacity, frailty, and biological age gap; and a biomarker layer generating trajectories and estimated glucose disposal rate. Calibration was staged across semaglutide clinical endpoints. Bayesian hierarchical meta-analysis, global sensitivity analysis, and practical identifiability analysis were used to assess robustness and interpretability. Results: The model reproduced semaglutide efficacy and tolerability dynamics and supported distinct drug-class profiles across metabolic and ageing axes. Rapamycin showed minimal glycaemic effect but emerged as a dominant driver of repair-related ageing outcomes. Combination simulations predicted two distinct optima: one favouring metabolic improvement and one favouring ageing-related benefit. Conclusion: The model supports the view that metabolic and ageing optimization are mechanistically distinct objectives and that weight loss and glycaemic improvement alone may be insufficient surrogates for health span benefit.
Gada, L.; Afuleni, M. K.; Noble, M.; House, T.; Finnie, T.
Knowing the mortality rates associated with infection by a pathogen is essential for effective preparedness and response. Here, harnessing the flexibility of a Bayesian approach, we produce an estimate of the Infection Fatality Ratio (IFR) for A(H5N1) conditional on explicit assumptions, and quantify the uncertainty thereof. We also apply the method to first-wave COVID-19 data up to March 2020, demonstrating the estimates that could be obtained were the model available then. Our analysis uses World Development Indicators (WDI) from the World Bank, the A(H5N1) WHO confirmed cases and deaths tracker by country (2003-2024), and COVID-19 cases and deaths data from Johns Hopkins University (January and February 2020). Since infectious disease dynamics are typically influenced by local socio-economic factors rather than political borders, individual countries are placed within clusters of countries sharing similar WDIs relevant to respiratory viral diseases, with clusters derived by performing Hierarchical Clustering. To estimate the IFR, we fit a Negative Binomial Bayesian Hierarchical Model for A(H5N1) and COVID-19 separately. We explicitly modelled key unobserved parameters with informative priors from expert opinion and literature. By modelling underreporting, our analysis suggests lower fatality (15.3%) compared to WHO's Case Fatality Ratio estimate (54%) on lab-confirmed cases. However, credible intervals are wide ([0.5%, 64.2%] 95% CrI). Therefore, good preparedness for a potential A(H5N1) pandemic implies adopting scenario planning under our central estimate, as well as for IFRs as high as 70%. Our approach also returns a COVID-19 IFR estimate of 2.8% with [2.5%, 3.1%] 95% CrI which is consistent with literature.
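The gap between the Case Fatality Ratio and the Infection Fatality Ratio reported above can be illustrated with a deliberately crude point calculation. This is not the authors' Negative Binomial hierarchical model: it assumes that a fixed fraction of infections is confirmed as cases and a fixed fraction of deaths is recorded, and both function names are hypothetical.

```python
def ifr_from_cfr(cfr, case_ascertainment, death_ascertainment=1.0):
    """Crude IFR implied by a CFR under fixed ascertainment fractions.

    With CFR = recorded deaths / confirmed cases, confirmed cases =
    case_ascertainment * infections and recorded deaths =
    death_ascertainment * deaths, it follows that
    IFR = CFR * case_ascertainment / death_ascertainment.
    """
    if not (0.0 < case_ascertainment <= 1.0 and 0.0 < death_ascertainment <= 1.0):
        raise ValueError("ascertainment fractions must lie in (0, 1]")
    return cfr * case_ascertainment / death_ascertainment

def implied_ascertainment_ratio(cfr, ifr):
    """Ratio case_ascertainment / death_ascertainment that reconciles a CFR and an IFR."""
    return ifr / cfr
```

Under these simplifying assumptions, `implied_ascertainment_ratio(0.54, 0.153)` is about 0.28: the central IFR estimate in the abstract would correspond to roughly one in four infections being confirmed if all deaths were recorded. The wide credible interval reported by the authors shows how uncertain any such single number is.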
Neumann, O. F.; Kravikass, M.; John, N.; Ramachandran, R. G.; Steinmann, P.; Zaburdaev, V.; Wehner, D.; Budday, S.
Functional spinal cord repair in zebrafish is governed by regeneration-favorable biochemical and mechanical cues within the lesion microenvironment. Alterations in extracellular matrix composition and stiffness are closely associated with axon regeneration. However, experimentally dissecting the interplay between mechanical signals and axonal regrowth in vivo remains technically challenging. Here, we present an agent-based modeling framework to simulate stiffness-mediated axonal growth trajectories across the lesion. We use this model to explore potential mechanisms underlying the characteristic growth patterns observed during zebrafish spinal cord regeneration. Computational predictions were qualitatively compared with confocal imaging data obtained from larval zebrafish. These phenomenological comparisons revealed a close agreement between simulated and experimentally observed axon growth, indicating that experimentally observed patterns could be governed by transient changes in the stiffness profile of the spinal cord and lesion microenvironment. Hence, our computational framework provides an in silico platform for investigating the role of mechanical cues in axon regeneration in the injured spinal cord.
D'Andrea, R.; Kocher, C.; Skiena, B.; Futcher, B.
Animals such as bees, ants, wasps, termites, and naked mole-rats live in colonies in which a single queen is the only female reproductive, an arrangement known as eusociality. Eusocial animals are known for their remarkably long lifespans. It has been argued that longevity becomes selected when queens are shielded from "external mortality". While such protection may contribute, we find a deeper reason: the eusocial reproduction strategy itself inherently creates selection for long lifespans. Lifespans typically reflect two processes: the baseline risk of death and the rate at which this risk increases with age. Each is a parameter in the Gompertz mortality equation. We show that the mathematical properties of eusocial reproduction lead to slowly-growing, older populations where selection acts more strongly on the rate at which risk increases than on the baseline risk. In addition, we show that channeling reproduction through a single female also selects for longevity, which we term the "queen effect". Thus, the dynamics of eusocial reproduction select for longer lifespan. More broadly, these results show that reproductive structure and population growth dynamics can fundamentally shape selection on lifespan, with implications outside eusocial systems as well.
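The two Gompertz parameters mentioned in this abstract can be made concrete with a short sketch. The Gompertz hazard has the standard form mu(x) = a * exp(b * x), with baseline risk `a` and rate of increase `b`. The function names, the numerical integration, and any parameter values used with them are illustrative assumptions; the abstract does not give the authors' parameterisation or population model.

```python
import math

def gompertz_hazard(age, baseline, rate):
    """Instantaneous risk of death at a given age: baseline * exp(rate * age)."""
    return baseline * math.exp(rate * age)

def gompertz_survival(age, baseline, rate):
    """Probability of surviving from age 0 to `age` (requires rate > 0)."""
    return math.exp(-(baseline / rate) * (math.exp(rate * age) - 1.0))

def life_expectancy(baseline, rate, max_age=200.0, step=0.01):
    """Life expectancy at age 0: trapezoidal integral of the survival curve."""
    n = int(max_age / step)
    total = 0.0
    prev = gompertz_survival(0.0, baseline, rate)
    for i in range(1, n + 1):
        cur = gompertz_survival(i * step, baseline, rate)
        total += 0.5 * (prev + cur) * step
        prev = cur
    return total
```

Lowering either `baseline` or `rate` lengthens life expectancy in this sketch. The abstract's argument concerns which of the two parameters selection acts on more strongly, which requires the population model that is not reproduced here.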
Musonda, R.; Ito, K.; Omori, R.; Ito, K.
The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has continuously evolved since its emergence in the human population in 2019. As of 1st August 2025, more than 1,700 Omicron subvariants have been designated by the Pango nomenclature system. The Pango nomenclature system designates a new lineage based on genetic and epidemiological information of SARS-CoV-2 strains. However, there is a possibility that strains that have similar genetic backgrounds and the same phenotype are given different Pango lineage names. In this paper, we propose a new algorithm, called FindPart-w, which can identify groups of viral lineages that share the same relative effective reproduction numbers. We introduced a new lineage replacement model, called the constrained RelRe model, which constrains groups of lineages to have the same relative effective reproduction numbers. The FindPart-w algorithm searches the equality constraints that minimise the Akaike Information Criterion of constrained RelRe models. Using hypothetical observation count data created by simulation, we found that the FindPart-w algorithm can identify groups of lineages having the same relative effective reproduction number in a practical computational time. Applying FindPart-w to actual real-world data of time-stamped lineage counts from the United States, we found that the Pango lineage nomenclature system may have given different lineage names to SARS-CoV-2 strains even if they have the same relative effective reproduction number and similar genetic backgrounds. In conclusion, this study showed that viruses that had the same relative effective reproduction number were identifiable from temporal count data of viral sequences. These findings will contribute to the future development of lineage designation systems that consider both genetic backgrounds and transmissibilities of lineages.
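The selection criterion named in this abstract, the Akaike Information Criterion, can be sketched as follows. AIC = 2k - 2 ln L, where k is the number of free parameters and L is the maximised likelihood. Constraining two lineages to share one relative effective reproduction number removes a parameter, so the constrained model wins whenever the loss in log-likelihood is smaller than one unit. The helper names and the candidate tuples are hypothetical; the search procedure of FindPart-w itself is not described in the abstract and is not reproduced.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood

def best_model(candidates):
    """Pick the candidate with the lowest AIC.

    `candidates` is a list of (label, log_likelihood, n_params) tuples.
    """
    return min(candidates, key=lambda c: aic(c[1], c[2]))
```

As a usage example with made-up numbers, `best_model([("unconstrained", -1000.0, 4), ("two lineages merged", -1000.4, 3)])` selects the merged model, because its AIC (2006.8) is lower than that of the unconstrained model (2008.0).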
Giri, R.; Agrawal, R.; Lamichhane, S. R.; Barma, S.; Mahatara, R.
We are pleased to submit our original article entitled "Assessing medication-related burden and medication adherence among older patients from Central Nepal: A machine learning approach" for consideration in your esteemed journal. In this paper, we assessed medication burden using the validated Living with Medicines Questionnaire (LMQ-3) and medication adherence using the Adherence to Refills and Medications Scale (ARMS). We analysed our results with a machine learning approach instead of a traditional statistical approach to identify the complex factors influencing both. Six ML models (Ordinary Least Squares, LightGBM, Random Forest, XGBoost, SVM, and penalized linear regression) were employed to predict ARMS and LMQ scores using various socio-demographic, clinical, and medication-related predictive features. Model explainability was provided through SHAP (SHapley Additive exPlanations). Our study identified a moderate medication burden with moderate non-adherence among older adults. Requiring assistance with medication and polypharmacy were the strongest drivers of medication burden and non-adherence. The high predictive accuracy of the ML models suggests that appropriate clinical interventions, such as deprescribing, could help address the highly prevalent medication burden and non-adherence among older adults in Nepal.
Billet, L. S.; Skelly, D. K.; Sauer, E. L.
Pathogens that persist subclinically across many wildlife populations can drive mass mortality in others. Mass mortality is often abrupt, and the timing can be difficult to predict from host or habitat features alone. In a recent field study tracking ranavirus epizootics in wood frog (Rana sylvatica) breeding ponds, we found that no environmental or biotic feature reliably predicted die-off occurrence or timing. Instead, the trajectory of viral accumulation in the water column was the strongest dynamic predictor of mass mortality. Infected hosts shed virus throughout epizootics, but the influence of waterborne viral concentration on disease progression was apparent only near die-off onset. This pattern suggests a potential threshold-dependent feedback operating through the shared viral environment. Here, we develop a compartmental model linking waterborne viral concentration to the rate at which subclinical infections progress to clinical, high-shedding states within already-infected hosts. We show that a dose-dependent progression model generates the two-phase epizootic trajectory observed in natural die-offs: prolonged subclinical circulation followed by abrupt clinical transition after environmental virus crosses an escalation threshold. The model exhibits a sharp phase transition between subclinical circulation and mass mortality, governed mainly by the clinical-to-subclinical shedding ratio, host density, and pond volume. Existing explanations for die-off variation emphasize individual-level susceptibility, but our model demonstrates that dose-dependent environmental feedback, a mechanism not previously formalized at the population level, can generate the transition from subclinical infection to mass mortality without invoking individual variation in host susceptibility. This mechanism may apply in any system where hosts share a bounded environment, pathogen dose influences disease severity, and pathogen shedding increases with disease progression.
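The feedback loop described in this abstract (waterborne virus raises the progression rate, and clinical hosts shed more virus) can be sketched as a small compartmental simulation. Everything specific here is an assumption made for illustration: the Hill-type form of the progression rate, the compartment names, the explicit Euler integration, and all parameter values. None of it is taken from the authors' model.

```python
def progression_rate(virus, max_rate=0.5, threshold=1.0, steepness=4.0):
    """Dose-dependent rate of subclinical-to-clinical progression (Hill form)."""
    v = virus ** steepness
    return max_rate * v / (v + threshold ** steepness)

def simulate(days=60.0, dt=0.01, subclinical0=100.0, shed_sub=0.01,
             shed_clin=1.0, decay=0.5, death_rate=0.3, volume=10.0):
    """Euler integration of subclinical, clinical, and dead hosts plus
    waterborne virus concentration; returns a list of state tuples."""
    sub, clin, dead, virus = subclinical0, 0.0, 0.0, 0.0
    history = [(0.0, sub, clin, dead, virus)]
    for step in range(1, int(days / dt) + 1):
        g = progression_rate(virus)
        d_sub = -g * sub                            # progression out of subclinical
        d_clin = g * sub - death_rate * clin        # progression in, deaths out
        d_dead = death_rate * clin
        # Shedding is diluted by pond volume; virus decays in the water.
        d_virus = (shed_sub * sub + shed_clin * clin) / volume - decay * virus
        sub += dt * d_sub
        clin += dt * d_clin
        dead += dt * d_dead
        virus += dt * d_virus
        history.append((step * dt, sub, clin, dead, virus))
    return history
```

Whether a run stays in low-level subclinical circulation or escalates depends on the made-up values of `shed_clin / shed_sub`, `subclinical0`, and `volume`, which correspond loosely to the shedding ratio, host density, and pond volume named in the abstract.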
Nassinghe, E.; Musinguzi, D.; Takuwa, M.; Kamulegeya, R.; Nabatanzi, R.; Namiiro, S.; Mwikirize, C.; Katumba, A.; Kivunike, F. N.; Ssengooba, W.; Nakatumba-Nabende, J.; Kateete, D. P.
Tuberculosis (TB) is prevalent in Uganda and overlaps with a high rate of HIV/TB coinfection. While nearly all hospital-based TB cases in Kampala, the capital of Uganda, show clear TB symptoms, 30% or more of undiagnosed TB cases found through active screening are asymptomatic. Additionally, the host risk factors for TB in Kampala cannot be distinguished from environmental risk factors. These TB-specific challenges are just part of the complexity, especially in areas with high HIV/AIDS burden. Data science techniques, especially Artificial Intelligence (AI) and Machine Learning (ML) algorithms, could help untangle this complexity by identifying factors related to the host, pathogen, and environment, which are difficult to explain or predict with traditional/conventional methods. In this project, we will use health data science approaches (AI/ML) to identify factors driving TB transmission within households and reasons for anti-TB treatment failure. We will utilize the computational resources at Makerere University and available demographic, clinical, and laboratory data from TB patients and their contacts to develop AI and ML algorithms. These will aim to: (1) identify patients at baseline (month 0) unlikely to convert their sputum or culture results by months 2 and 5, thus at risk of failing TB treatment; (2) identify household contacts of TB cases who are at risk of developing TB disease, as well as contacts who may resist TB infection despite repeated exposure to M. tuberculosis. Achieving these objectives will provide evidence that data science methods are effective for early detection of potential TB cases and high-risk patients, thereby helping to reduce TB transmission in the community. The study protocol received approval from the School of Biomedical Sciences IRB, protocol number SBS-2023-495.
Zhan, Q.; Pascual, M.; He, Q.
Major surface antigens in many pathogens are encoded by rapidly diversifying multigene families, generating fitness variation through antigenic and functional differences. These variations align with the niche and absolute fitness axes of Modern Coexistence Theory (MCT). Yet, how such gene families evolve along these axes under competition for hosts and across transmission gradients remains poorly understood, as prior MCT studies have not explicitly accounted for evolutionary dynamics in high dimensions. We use a stochastic computational model of Plasmodium falciparum transmission to examine how transmission intensity and selection shape var multigene family evolution and composition within parasite genomes. Results show that selection alone cannot maintain the observed stable ratio of two gene groups within parasite genomes, indicating that group-based classifications do not clearly reflect transmission strategy or virulence. When a trade-off exists between diversification rates and absolute fitness, strong immune selection under high transmission favors fast-recombining genes while attenuating functional selection on R0-associated traits. In general, stronger immune selection increases the invasion probability of novel antigens and the niche differentiation among parasite genomes, while reducing the variance in gene-level transmissibility and expression duration, and therefore R0. This outcome, combining enhanced niche differentiation and reduced absolute fitness variation, departs from MCT predictions.
CHOUHAN, P.; Zavala-Romero, O.; Haseeb, M.
Invasive insect species pose serious threats to agriculture and ecosystems, with their spread increasingly accelerated by global trade and climate change. To support prevention and mitigation efforts, it is essential to map the regions where these pests can survive and thrive. Here, we apply MaxEnt, a leading species distribution modeling framework, to estimate current (2020) and future (2040-2060) suitable habitats for five major invasive insects across the contiguous United States: brown marmorated stink bug, corn earworm, spongy moth, root weevil, and spotted lanternfly. To account for an uncertain climatic future, these projections are generated under four shared socioeconomic pathways, which reflect a range of plausible climate change scenarios. Beyond forecasting distributions, we examine several key modeling decisions, especially those often overlooked in practice. In particular, we find that background sampling strategies play a critical role in model calibration and that a hybrid sampling approach with a moderate buffer bias provides better predictive accuracy. We also show that permutation importance scores, commonly used to rank environmental variables, are highly sensitive to small changes in the background data and should be interpreted with caution. Finally, to bridge the gap between ecological modeling and applied machine learning, we provide a self-contained, math-focused background to MaxEnt aimed at practitioners outside of traditional ecological fields. Overall, this work delivers reproducible modeling workflows and critical insights into building robust, transparent, and ecologically meaningful MaxEnt models for climate-informed species distribution analysis.
Harbert, R. A.; Kovarovic, K.; Gruwier, B.
Dental morphology and wear patterns provide insight into the dietary adaptations and ecological niches of living and extinct herbivores. Traditional classification statistics such as Linear Discriminant Analysis (LDA) are limited by assumptions of linearity, normality, and homoscedasticity. This study quantifies mesowear, the shape of molar cusps resulting from occlusal wear, and evaluates the performance of non-linear machine learning models in predicting herbivore diets based on geometric morphometric (GMM) data from adult mandibular second molars (M2) in bovids. We applied Generalized Procrustes Analysis and Principal Component Analysis (PCA) to digitized occlusal shape coordinates from 132 M2 specimens across 64 species. Using the resulting principal component scores, we compared the classification accuracy of LDA with three non-linear models: Random Forest, K-Nearest Neighbors, and Gradient Boosting. While LDA achieved a cross-validated accuracy of just 31%, all non-linear models achieved 99% cross-validation accuracy and 90% test accuracy, demonstrating substantially improved performance. Misclassification analyses revealed that non-linear models more effectively captured complex shape differences, particularly among species with overlapping wear patterns. Our findings support the integration of machine learning with geometric morphometrics to quantify mesowear and improve dietary classification, providing a framework for robust paleoecological inference.
Zhang, E. R.; Mermer, O.; Demir, I.
Road traffic accidents represent a global public safety crisis, necessitating advanced computational tools for accurate injury severity prediction and effective decision support. This study evaluates high-performing ensemble machine learning models, including AdaBoost, XGBoost, LightGBM, HistGBRT, CatBoost, Gradient Boosting, NGBoost, and Random Forest, using a comprehensive National Highway Traffic Safety Administration (NHTSA) dataset from 2018 to 2022. While all models demonstrated exceptional predictive accuracy, with HistGBRT achieving the highest overall accuracy of 92.26%, a defining achievement of this work is the perfect classification (100% precision and recall) of fatal injuries across all ensemble architectures. To bridge the gap between predictive performance and actionable intelligence, this research integrates SHapley Additive exPlanations (SHAP) to provide both global insights into dataset-wide risk factors and local, instance-specific rationales for individual crash events. The global analysis identified ethnicity, airbag deployment, and harmful event type as primary drivers of injury severity, while local force and waterfall plots revealed the precise "push and pull" of variables for specific incidents. The results offer a robust, interpretable framework for stakeholders tasked with improving traffic safety and mitigating crash-related harm.
Deng, F.; Li, H.; Sun, D.; Duan, G.; Sun, Z.; Xue, G.
High protein expression is usually desirable in industry and research, and codon optimization is widely used to achieve it. Codon optimization methods fall into two branches: classical methods, which develop cost functions based on empirical rules, and AI methods, which use neural networks to learn codon choice principles from endogenous genes. Here we develop two codon optimization tools based on the two branches respectively, namely OptimWiz 2.1 and OptimWiz 3.0. Results of fusion protein fluorescence detection indicate that both OptimWiz 2.1 and OptimWiz 3.0 are superior to all the other commercially available codon optimization tools. Principles of codon optimization are revealed in the process of machine learning on both tools.
Marzban, S.; Robertson-Tessi, M.; West, J.
Mechanistic modeling has long been used as a tool to describe the dynamics of biological systems, especially cancer in response to treatment. Its key advantage lies in the interpretability of relationships between input parameters and outcomes of interest. In contrast, machine learning techniques offer strong prediction performance, especially for high-dimensional datasets that are common in oncology. Here, we employ a Mechanistic Learning framework that combines the advantages of both approaches by training machine learning models on mechanistic parameters inferred from clinical patient data. The mechanistic model (a Markov chain model) contains sixteen parameters that describe the rate of cell fate transitions that occur in patients with B-cell precursor acute lymphoblastic leukemia. The machine learning model (a ridge logistic regression model) is trained on these parameters to predict two clinically relevant features: BCR::ABL1 fusion gene status (positive or negative) and minimal residual disease status (positive or negative) post-induction chemotherapy. Model training is done in an iterative fashion to assess which (and how many) parameters are critical to maintain high predictive performance. Using machine learning models trained on the clinical flow-cytometry data, we find that the stem-like cell state alone is the most predictive feature for both BCR::ABL1-positive and MRD-positive disease, with combination scores (defined as the average of accuracy, balanced accuracy, and area under the curve) of 0.80 and 0.67, respectively. By comparison, mechanistic learning achieves comparable or improved combination scores for BCR::ABL1-positive and MRD-positive disease, with scores of 0.81 and 0.71, respectively, using only de-differentiation for BCR::ABL1 and primitive-state persistence together with differentiation-directed exit for MRD. Thus, the mechanistic-learning approach not only preserves predictive performance, but also provides a biological hypothesis for why stemness is predictive of these clinically relevant outcomes.
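The combination score defined in this abstract (the average of accuracy, balanced accuracy, and area under the curve) can be computed as in the sketch below. The 0.5 decision threshold, the rank-based AUC with ties counted as one half, and the function names are assumptions; the abstract does not state how the authors computed each component.

```python
def accuracy(y_true, y_pred):
    """Fraction of labels predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels 0/1."""
    pos = [p for t, p in zip(y_true, y_pred) if t == 1]
    neg = [p for t, p in zip(y_true, y_pred) if t == 0]
    sensitivity = sum(p == 1 for p in pos) / len(pos)
    specificity = sum(p == 0 for p in neg) / len(neg)
    return 0.5 * (sensitivity + specificity)

def auc(y_true, scores):
    """Area under the ROC curve as the fraction of positive/negative pairs
    ranked correctly (ties count one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def combination_score(y_true, scores, threshold=0.5):
    """Average of accuracy, balanced accuracy, and AUC."""
    y_pred = [1 if s >= threshold else 0 for s in scores]
    return (accuracy(y_true, y_pred)
            + balanced_accuracy(y_true, y_pred)
            + auc(y_true, scores)) / 3.0
```

For example, with labels `[1, 1, 0, 0]` and scores `[0.9, 0.4, 0.6, 0.1]`, accuracy and balanced accuracy are both 0.5 and the AUC is 0.75, giving a combination score of about 0.58.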
Luty, M. T.; Borah, D.; Szafranska, K.; Giergiel, M.; Trzos, K.; McCourt, P.; Lekka, M.; Kotlinowski, J.; Zapotoczny, B.
Background and Aims: Fenofibrate is widely prescribed for hyperlipidaemia and has been associated with rare but severe cases of drug-induced liver injury (DILI), yet its effects on liver sinusoidal endothelial cells (LSECs) remain to be investigated. LSECs maintain a highly permeable specialized sinusoidal barrier characterized by transcellular pores (fenestrations), regulating the bidirectional transfer of circulating compounds to and from the hepatocytes. As drug-induced alterations in fenestration architecture could influence xenobiotic access to hepatocytes, these changes may modulate pathways associated with DILI. Understanding the effects of fenofibrate on LSEC ultrastructure may therefore provide insights into previously underexplored endothelial contributions to hepatic drug responses. Methods: Both fenofibrate and its active metabolite, fenofibric acid, were evaluated for their effects on LSEC ultrastructure, mechanical properties, and functional markers. Atomic force microscopy (AFM) and scanning electron microscopy (SEM) were used to quantify fenestration architecture. AFM was additionally used to measure cellular mechanical properties, which were interpreted in the context of fluorescence-based quantification of cytoskeletal organization. Gene expression, viability, and cytotoxicity were assessed using PCR-based and biochemical assays. Results: Fenofibrate reduced fenestration number and porosity at both tested concentrations (10 and 25 µM). It also decreased the apparent Young's modulus of LSECs, accompanied by changes in tubulin and actin architecture, without detectable cytotoxicity. In contrast, treatment with fenofibric acid did not result in significant structural or mechanical effects on LSECs, even at higher concentrations. Conclusions: Together, these data identify LSECs as a drug-responsive hepatic cell type for fenofibrate, suggesting that LSECs could represent an underrecognized contributor to the complex, multifactorial processes underlying DILI. This work provides a framework for evaluating endothelial contributions to fenofibrate-associated liver effects in more complex models.
Figure 1: Fenofibrate reduces LSEC fenestrations and metabolic activity at higher concentrations, while its metabolite, fenofibric acid, does not affect LSEC, regardless of its concentration.